Conversation
Codecov Report
Attention: Patch coverage is

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           master    #3266      +/-   ##
==========================================
- Coverage   54.28%   54.18%   -0.10%
==========================================
  Files         134      134
  Lines       12356    10247    -2109
==========================================
- Hits         6707     5552    -1155
+ Misses       5152     4190     -962
- Partials      497      505       +8
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Force-pushed from e5b1b08 to 766e1ff.
maxsmythe left a comment:
Minor comments, but very nice!
(Five resolved review threads on pkg/controller/constrainttemplate/constrainttemplate_controller.go, marked outdated.)
Force-pushed from 8b5048a to 90423db.
(Resolved review thread on pkg/controller/constrainttemplate/constrainttemplate_controller_test.go, marked outdated.)
Force-pushed from 1792445 to cf29d69.
```go
}

// via the registrar below.
constraintEvents := make(chan event.GenericEvent, 1024)
```
I see.
Where is the constraint controller reading from `constraintEvents`? Should we just be writing to `regEvents` instead?
If we want to use a second channel, we'd need to plumb it through.
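For illustration, a minimal sketch of what "plumbing it through" could look like with controller-runtime. This uses the struct form of `source.Channel` from older controller-runtime releases (newer releases expose `source.Channel` as a function instead), and the helper name is invented, not taken from the PR:

```go
import (
	"sigs.k8s.io/controller-runtime/pkg/controller"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/source"
)

// watchConstraintEvents registers constraintEvents as a watch source on the
// constraint controller. Without a registration like this, nothing drains
// the channel, and writes to it block once the 1024-entry buffer fills.
func watchConstraintEvents(c controller.Controller, constraintEvents chan event.GenericEvent) error {
	return c.Watch(
		&source.Channel{Source: constraintEvents},
		&handler.EnqueueRequestForObject{},
	)
}
```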
Force-pushed from 2f860cc to f5ea086.
@maxsmythe @sozercan PTAL when you get a chance
```go
		return reconcile.Result{}, err
	}
}
if !reflect.DeepEqual(currentVap, newVap) {
```
nit: I'd make this an else if statement, otherwise we make two writes to the API server instead of one (on account of `currentVap` being nil, guaranteeing `reflect.DeepEqual()` will be false).
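For concreteness, a sketch of the suggested restructuring; the variable names mirror the diff context, while `r.writer`, `ctx`, and the surrounding scaffolding are assumptions rather than Gatekeeper's actual code:

```go
if currentVap == nil {
	// The VAP object does not exist yet, so a single Create suffices.
	if err := r.writer.Create(ctx, newVap); err != nil {
		return reconcile.Result{}, err
	}
} else if !reflect.DeepEqual(currentVap, newVap) {
	// The object exists but drifted; update it. The else-if avoids a
	// redundant Update right after the Create, since a nil currentVap
	// guarantees reflect.DeepEqual(currentVap, newVap) is false.
	if err := r.writer.Update(ctx, newVap); err != nil {
		return reconcile.Result{}, err
	}
}
```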
```go
		return reconcile.Result{}, err
	}
}
if !reflect.DeepEqual(currentVapBinding, newVapBinding) {
```
nit: I'd make this an else if statement as well, otherwise we make two writes to the API server instead of one (on account of `currentVapBinding` being nil, guaranteeing `reflect.DeepEqual()` will be false).
```go
	return reconcile.Result{}, err
}

// after vap is updated, trigger update event for all constraints
```
I think we actually want to do this after create/delete (not update), since we want to re-reconcile constraints if the "use VAP" annotation ever changes (so constraints can re-evaluate whether their default behavior should change).
On update, "use VAP" was true and will continue to be true; on create/delete, "use VAP" has changed (or the CT is brand new/just deleted).
```go
}
// do not generate vap resources
// remove if exists
if !generateVap && constraint.IsVapAPIEnabled() {
```
We should add the "listObjects/replay constraints" behavior here as well (it might make sense to factor that out into its own function to avoid code duplication); see the sketch below.
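One way to factor it out, as a hedged sketch; the helper name and signature are invented for illustration and may not match Gatekeeper's real code:

```go
import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/event"
)

// replayConstraints lists every constraint of the given kind and enqueues a
// GenericEvent for each one, forcing the constraint controller to
// re-reconcile. listGVK must be the constraint's list GVK (e.g. Kind
// "K8sRequiredLabelsList").
func replayConstraints(ctx context.Context, reader client.Reader, listGVK schema.GroupVersionKind, out chan<- event.GenericEvent) error {
	constraints := &unstructured.UnstructuredList{}
	constraints.SetGroupVersionKind(listGVK)
	if err := reader.List(ctx, constraints); err != nil {
		return err
	}
	for i := range constraints.Items {
		out <- event.GenericEvent{Object: &constraints.Items[i]}
	}
	return nil
}
```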
Thinking more deeply about this, I think we have a general race condition. Let's pretend:

- the constraint controller is fast
- the CT controller is slow (likely, since it may require compilation of Rego/CEL)
- there is some constraint Foo
- it is backed by some template FooTemplate
- there are 3 events for FooTemplate: no-generate-vap, generate-vap, no-generate-vap
- the FooTemplate reconciler is slow enough that only two events get reconciled: no-generate-vap, no-generate-vap
- the constraint controller gets a reconcile while generate-vap is set

If this sequence of events happens, you could get a dangling VAP binding, because when the constraint controller reconciles, it's going to run `client.Get()` against the informer cache and see the annotation telling it to create a binding, but the constraint template controller sees no change (from its view, no-generate-vap has always been set), so it will not trigger a reconcile.

I think I see a way out of this, but it's not super pretty (a rough sketch follows this comment):

1. Create a shared cache between the constraint template and constraint controllers that stores the CT controller's opinion as to whether a constraint kind should/should not generate VAP objects.
2. Create a function on the CT reconciler called `shouldGenerateVAP(kind string, shouldGenerate bool)`; when `shouldGenerate` changes for a particular kind, that's what updates the cache and triggers a re-reconcile of all constraints for a given template.
3. Call `shouldGenerateVAP()` where we are currently running list calls (might need to think about the timing).
4. The constraint controller reads from the cache, rather than from the object.
5. Because the results of reconciliation are dependent on internal state, we should run reconciliation as a singleton process (i.e. throw it on the status/audit Pod); this will prevent thrashing of the k8s resources whenever the pods' watch events are out of sync.

(5) is probably a good thing to address generally, since there is also a chance of thrashing on G8r upgrade. I'd suggest that for any controller that creates a downstream object (CRDs, VAP objects), the creation should run as a singleton (we could carve it out as an operation, perhaps, like audit or status updates). The controllers should still run for the purposes of caching state (e.g. loading constraint templates into CF), and they should still write out pod status objects.

This is a medium-to-large body of work. I would recommend we do it before graduating to beta (or at least before allowing users to set failure policy to Fail). However, I don't think we need to block the current PR on it because:

- This is an in-development feature that is off by default.
- We currently fail open, so a dangling binding will not cause any issues.
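To make the proposal concrete, a hedged sketch of items 1, 2, and 4; every name here is invented for illustration, nothing is Gatekeeper's actual API:

```go
import "sync"

// vapFlagCache stores the CT controller's opinion on whether each constraint
// kind should generate VAP objects (item 1 of the proposal).
type vapFlagCache struct {
	mu    sync.RWMutex
	kinds map[string]bool
	// onChange triggers a re-reconcile of all constraints of a kind
	// whenever the opinion flips.
	onChange func(kind string)
}

func newVAPFlagCache(onChange func(kind string)) *vapFlagCache {
	return &vapFlagCache{kinds: map[string]bool{}, onChange: onChange}
}

// shouldGenerateVAP is called by the CT reconciler (item 2). It records the
// new opinion and fires onChange only on an actual transition, so duplicate
// reconciles don't cause constraint churn.
func (c *vapFlagCache) shouldGenerateVAP(kind string, shouldGenerate bool) {
	c.mu.Lock()
	prev, seen := c.kinds[kind]
	c.kinds[kind] = shouldGenerate
	c.mu.Unlock()
	if (!seen || prev != shouldGenerate) && c.onChange != nil {
		c.onChange(kind)
	}
}

// generateVAP is read by the constraint controller instead of inspecting the
// template object in the informer cache (item 4), closing the race above.
func (c *vapFlagCache) generateVAP(kind string) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.kinds[kind]
}
```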
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
Force-pushed from f5ea086 to 76f216f.
maxsmythe left a comment:
LGTM, but see my caveat about a race condition above.
What this PR does / why we need it:

Which issue(s) this PR fixes (will close the issue(s) when the PR gets merged): Fixes #2874

Special notes for your reviewer: